Deep learning-based landslide susceptibility mapping

Landslides are considered as one of the most devastating natural hazards in Iran, causing extensive damage and loss of life. Landslide susceptibility maps for landslide prone areas can be used to plan for and mitigate the consequences of catastrophic landsliding events. Here, we developed a deep convolutional neural network (CNN–DNN) for mapping landslide susceptibility, and evaluated it on the Isfahan province, Iran, which has not previously been assessed on such a scale. The proposed model was trained and validated using training (80%) and testing (20%) datasets, each containing relevant data on historical landslides, field records and remote sensing images, and a range of geomorphological, geological, environmental and human activity factors as covariates. The CNN–DNN model prediction accuracy was tested using a wide range of statistics from the confusion matrix and error indices from the receiver operating characteristic (ROC) curve. The CNN–DNN model was evaluated comprehensively by comparing it to several state-of-the-art benchmark machine learning techniques including the support vector machine (SVM), logistic regression (LR), Gaussian naïve Bayes (GNB), multilayer perceptron (MLP), Bernoulli Naïve Bayes (BNB) and decision tree (DT) classifiers. The CNN–DNN model for landslide susceptibility mapping was found to predict more accurately than the benchmark algorithms, with an AUC = 90.9%, IRs = 84.8%, MSE = 0.17, RMSE = 0.40, and MAPE = 0.42. The map provided by the CNN–DNN clearly revealed a high-susceptibility area in the west and southwest, related to the main Zagros trend in the province. These findings can be of great utility for landslide risk management and land use planning in the Isfahan province.

. The main CNN-DNN architecture 48

Analysis method
In deep learning and data mining, the extraction of features plays an important role. These extracted features can be used for classification or prediction with high accuracy. Since spatial prediction (i.e., mapping) is crucial for a range of applications including crisis management, urban planning and geo-hazard assessment (including landslide susceptibility assessment), the coupled CNN and DNN classifier has found wide applicability 8,37,42,49 .
In the CNN-DNN classifier, the input data are evaluated by convolution, pooling, batch normalisation, dense, dropout and fully connected layers to predict the outputs (Fig. 1). The number of layers can be increased, thus, increasing the learning depth. The input data provide the first layer of evaluation as a data matrix in which each element has a specific feature value. Hence, the input layer is the primary feature map modified and organised by each convolutional layer and unit. These units extract different features from the input data. The first convolutional layer extracts some low-level features (e.g., lines, edges, corners). Further convolutional layers learn iteratively more intricate representations or features. Pooling is a critical manipulation in a CNN. Max-pooling is the most common manipulation amongst the different pooling approaches. Max-pooling aims to divide the feature maps into several rectangular zones and provide the maximum value for each zone 42 . Batch normalisation (or batch-norm) aims to increase the speed, performance and stability of the network. Batch normalisation is used to normalise the input layer by re-centring and re-scaling. The dense or regular densely-connected layer is commonly used as a linear/non-linear layer applied to the input and returned to the output. Fully connected layers connect every neuron in a preceding layer to every neuron in a subsequent layer. This is, in principle, the same as the traditional MLP network 41 . Combining these layers in the sequence can extract the desired features and, thereby, classify the input data into the desired classes.
Knowledge-based approaches have received significant attention in landslide susceptibility analyses where machine learning methods such as the CNN and DNN have provided highly accurate results. These methods, now considered common procedures, are applied to analyse visual imagery for image recognition and classification. CNNs and DNNs are regularised versions of the MLP, consisting of an input and an output layer and multiple hidden layers. The hidden layers of the CNN are typically concluded with a series of convolutional layers with multiplication or another dot product (e.g., the activation function is mostly RELU). On the other hand, the DNN finds the correct mathematical manipulation to transform the input into the output (based on linear or non-linear relationships). Each mathematical manipulation is considered a layer, and complex DNNs have many layers 40,41 . Since 2019, the application of the CNNs and DNNs in landslide susceptibility analyses has led to establishment of the potential of deep learning for landslide susceptibility mapping 32,42,46 . More widely, implementation of the coupled CNN-DNN has led to increased accuracy compared to the implementation of these two methods separately. We, thus, develop a coupled CNN-DNN methodology to assess landslide susceptibility.
Study site and data. Study location. The study area is located in the Isfahan province of central Iran and covers an area of approximately 106,786 km 2 (Fig. 2). Markazi, Qom and Semnan provinces are located to the north, and the Fars and Kohgiluyeh-Boyerahmad provinces are located to the south of the Isfahan province. The city of Isfahan, which is the capital of the Isfahan province, is considered to be the historical, cultural and touristic capital of Iran. The Isfahan province experiences a moderate and dry climate that ranges from 10.6 to 40.6 °C annually (the average annual temperature has been recorded as 16.7 °C). The annual rainfall of Isfahan has been recorded to range from 16.5 to 217.3 mm, with an average annual rainfall of 116.9 mm 50 . Figure 2 provides the location of 222 historical landsides that were identified during comprehensive field survey and from areal imagery. Geologically, the study area is located on a plain with rocky outcrops and mountains towards the north-western and south-western parts. The Zagros suture zone is the result of a collision between the Arabian and Central Iran tectonic plates 51 . The main tectonic trend in the region follows the Zagros Mountains. The trend is aligned NW-SE and has affected the geo-structures in the region, including fault orientations, folding and shear zone formation 52 . Although most of the study area is covered by Quaternary sediments, the geological formations in the region include late Triassic rocks 53 . The geo-structures in the region can lead to different sliding and land-movement activities. The most important reasons for landslides in this province relate to tectonic  Landslide covariates. The selection of a set of influencing factors is considered a key step in landslide susceptibility analysis 56 . Both full-length field surveys and remote sensing observations were acquired to provide a detailed landslide assessment of the study area. During the field surveys, 222 historic landslides in the study area were identified to determine landslide-prone areas. Several triggering factors, as used in numerous studies on machine learning-based landslide susceptibility modelling, categorized into several groups 57,58 , were used as landslide conditioning factors. The selection of the triggering factors required several considerations related to the dependency of triggering elements, measurability, non-redundancy and relevance of geological characteristics. The main factors influencing landslide occurrence were identified by preparing a spatial landslide inventory database that included the spatiotemporal distribution of historical landslides and a set of potential influencing factors. As a result, four main groups of factors were identified as the most effective elements that triggered landslide movements, including geomorphologic (i.e., altitude variation, slope aspect, slope curvature, profile curvature), geological (i.e., geo-units, distance to faults, land-use, soil type, hydrologic variation, slope-dip), environmental (i.e., climate, watershed, drainage pattern, vegetation) and human activity-related (i.e., distance to roads, distance to cities) covariates. These covariates were identified based on expert knowledge from fieldwork and remote sensing imagery. Table 1 provides information about the selected covariates used in this study. Before these data can be used in susceptibility modelling, it could be subject to multicollinearity and correlated variables 21 . The multicollinearity is a phenomenon in which one predictor variable in a regression model can be predicted linearly from others. To test for multicollinearity variance inflation factors (VIF) are commonly used 21,59 . A VIF > 5 indicates potential multicollinearity. In this article, all selected triggering factors produced VIF values less than 2.1 ( Table 1).
Data preparation. In this research, four groups of covariates were considered for landslide susceptibility analysis. The inventory-based dataset was prepared using a digital elevation model (DEM) and Landsat TM (5)(6)(7)(8), and ETM + satellite sensor imagery provided by the Geotechnology Unit, Department of Geological Engineering, Middle East Technical University. The dataset included 222 recorded historical landside locations that were retrieved from technical documents, fieldwork and areal images taken from landsliding sites, checked using GPS coordinates and site-survey. The predictive models were fitted based on both landslide and non-landslide cells (i.e., where landslides did not occur). The flat plain area in Isfahan province was considered as contributing non-landslide cells (112 points in the dataset) mostly located in the east of the province. According to Huang et al. 33 , three methods exist for attaining non-landslide grid cells: the seed-cell procedure, random selection and flat locations (slope lower than 2°). This study used random selection, while including flat locations as well (due to the geomorphological condition of the province). After providing the main database, this database was divided into training and testing sets (80% and 20% of the information from the ground survey, respectively). The training set comprises 60% landslide − 40% non-landslide; while testing set comprises 55% landslide − 45% non-landslide. Figures 3, 4, 5 and 6 present maps of the landslide covariates to support visual assessment of the performance of the various methods tested. The ArcGIS v10.4 software was used to produce the landslide susceptibility maps. All evaluated spatial data were converted to spatially defined layers to produce the landslide susceptibility maps. The proposed algorithm was implemented in the Python high-level programming language. www.nature.com/scientificreports/ The results of the CNN-DNN evaluation were extracted as shapefiles and used as information layers in a GIS environment.

Methodology.
The study was conducted in several stages. First, ground survey was performed to estimate and record historical landslides in the study area. Second, by considering both the feature extraction of the CNN and the classification capabilities of the DNN, it was possible to identify highly susceptible (high risk) areas, potentially with high accuracy. In the next stage the model tested by using performance criteria, error models and the ROC curve. This research evaluated the suitability of the proposed CNN-DNN method to produce detailed landslide susceptibility maps for the Isfahan province in Iran. The performance of the CNN-DNN was evaluated against several high-quality benchmark approaches through a range of appropriate statistical measures. A total of 15 landslide covariates, falling into four main groups, were fed into the CNN-DNN. All covariate layers were normalised and then entered into the model to standardise and prepare the information for landslide susceptibility analysis. The CNN was used for feature extraction, and the DNN was used to sort pixels into the high-susceptible and low-susceptible groups. Table 2 provides the hyperparameters used in the study. Hyperparameters are commonly used to optimize the fitting process which can increase the machine learning model prediction accuracy 21 . The objective of selecting the hyperparameters is to optimize the evaluation values 38,61 . Different optimisers were used for the hyperparameters, noting that some optimizers provide more accurate results than others 61 . The presented study used the grid search technique for the assessments. The hyperparameters that provide the highest accuracy were chosen for the final training and testing of the respective machine learning models 21 . Figure 7 presents a flowchart describing the process applied for susceptibility assessment. As seen in the figure, the landslides dataset includes 222 historical cases and field survey recordings divided randomly into training (80%) and testing (20%) datasets. The database consists of the landslide inventory datasets (training and validation) and the landslide triggering factors. These factors were subsequently evaluated by calculating their weights from the relationship between the landslide occurrences and landslide triggering factors and then these results were checked 62 . There is no standard for the selection of triggering factors in susceptibility mapping, but the chosen factors have to be measurable depending on a particular area's characteristics 63 . As mentioned, the test and train datasets represented 20-80%, respectively, of the primary database, taking their spatial distributions into account. Considering the test/train ratio is important for the model learning rate, that is, the response to the estimated error each time the model weights are updated. In fact, the learning rate controls how quickly the model is adapted to the problem. Smaller learning rates require more training epochs as smaller changes are made to the weights at each update, whereas larger learning rates result in rapid changes and require fewer training epochs. Specifically, the learning rate is a configurable hyperparameter used in the training of neural networks that has www.nature.com/scientificreports/ a small positive value, often in the range between 0.0 and 1.0. The learning rate used in this study was selected by optimizers, which for 0.01 and no momentum were scheduled via callbacks in Keras support. Pearson's Phi coefficient was used to assess the susceptibility classification, and each of the landslide influencing factors was used in this process. The coefficient takes into account true and false positives and negatives. It is generally a balanced measure that can be used even if the class proportions are of very different sizes. Figure 8 presents the Pearson's coefficients for each layer. These information layers constituted the landslide dataset and were input to the CNN to extract more informative features for susceptibility assessment. These feature representations were then used in the DNN model to produce the susceptibility map. As is well known, some of the covariates, such as land cover, can be accepted by the proposed method, but some others, such as land use, must be modified before input to the CNN-DNN. In this regard, we used the class weight argument in the Keras package to select a large weight for unbalanced classes in such factors (to produce balanced values).
This classification is based on a set of influencing factors (which cover extrinsic and intrinsic elements) trained on historical landslide occurrences (a total of 222 landslides) characterising very high and high susceptibility zones. The historical landslide data were prepared and extracted from shapefiles implemented in a GIS environment and evaluated for each input factor.
To assess the proposed methodology rigorously, its accuracy was evaluated using statistics from the OA and ROC and compared with the accuracies of common machine learning methods, including the SVM, LR, GNB, MLP, BNB and DT classifiers. From the confusion matrix, the mean squared error (MSE), root mean square error (RMSE) and mean absolute percentage error (MAPE) were used to measure the model accuracy. In this regard,   Therefore, both precision and recall are based on measures of relevance 41 . The false-positive rate can be calculated as '1-specificity' , where specificity is defined as: Accuracy can be a misleading metric for imbalanced datasets. For example, for a prediction set with 95 positive and 5 negative values, classifying all values as negative gives a 0.95 accuracy score. On the other hand, the F1-score, the harmonic mean of precision and recall, provides approximately the average of the two values when they are close and is more generally the harmonic mean.
The overall accuracy (OA) represents the probability that a test will correctly classify an individual; that is, the sum of TP plus TN divided by the total number of the individuals tested: OA is, thus, also the weighted average of 'sensitivity' and 'specificity' (Aggarwal, 2018). The application of the performance matrix helps to characterise the trustworthiness of the classifier in question.

Results
The proposed models. Landslide hazard susceptibility assessment was conducted by applying the proposed CNN-DNN methodology to evaluate landslide susceptibility in the study area (Fig. 9). The OA and ROC controlled the result of the proposed model. The ROC curve is a graphical description that shows the diagnostic ability of a binary classifier system as its discrimination threshold is varied. As a result, the OA and AUC from the ROC curve represent the accuracy of the classifiers. Figure 10 presents the OAs and loss models for the CNN-DNN. These figures show that the estimated OA is 0.909, and the loss is reduced to 0.20 in 5000 epochs. According to the evaluated hazard map presented in Fig. 9, susceptible and hazardous areas in the west and southern parts of the Isfahan province manifest spatially as visual stripes. DNN optimisers such as SGF, RMSprop and Adagrad estimated the modified IRs for the CNN-DNN (Figs. 11, 12, 13). IRs can be used as the AUC value of OA to control the performance of the algorithm. These stripes follow the northwest-southeast trace, which represents the main Zagros trend in the region. Therefore, it can be suggested that geological factors have had the most significant impact on landslide occurrence in the Isfahan province.

Benchmark comparison.
To evaluate the performance of the proposed CNN-DNN model rigorously, a large range of state-of-the-art and widely applied machine learning techniques were tested using the same accuracy statistics as applied to the proposed method. These benchmark methods include the SVM, LR, GNB,   Table 3 shows the values of the classification metrics, the proposed model performed more accurately than all six benchmarks in all metrics. The proposed model produced the highest rate of ROC accuracy with a value of 90%. After the proposed model, the decision tree classifier achieved the next best performance, with accuracy approximately 5% lower than for the proposed model. The lowest estimated accuracy of 50% was achieved by the MLP and BNB, which is 40% less than for the proposed model. Regarding accuracy criteria, the proposed model produced an accuracy of 84.8%, and the closest algorithm (SVM) to the proposed model was approximately 4.7 less than the proposed model. The MLP produced the lowest accuracy of 61%. The average    www.nature.com/scientificreports/ precision for the two susceptibility classes of the proposed model is 84%. The lowest average accuracy of the MLP and BNB is 34% (a difference from the proposed model of more than 50%). The average recall rate for the proposed model is 88%, and the minimum recall rate is 55% for BNB. For the F1-score, the average is 85.5% for the proposed model, and, as for the other three criteria examined, this is the largest value amongst all models. The DT algorithm produced the next largest F1-score, with an average value of 83.5% and a difference of almost 2% less than the proposed model.  www.nature.com/scientificreports/

Discussion
We investigated the potential of a coupled deep neural network (CNN-DNN) to predict landslide susceptibility spatially. The algorithm was evaluated using data with a spatial resolution of 30 m representing the Isfahan province, Iran. Indices associated with historical landslide occurrences (a total of 222 landslides) were used as the landslide inventory dataset, and this was divided randomly into training (80%) and testing (20%) sets for the analysis. Four main covariates, including geomorphologic, geologic, environmental and human activityrelated covariates, were identified based on field and remote sensing investigations. The CNN-DNN model was able to produce a susceptibility map for the study area with appropriate accuracy. The results show a significant increase in landslide susceptibility prediction accuracy compared to the benchmark models. Notwithstanding the high accuracy achieved by the proposed CNN-DNN predictive model for landslide susceptibility mapping, this study has some limitations that could be considered in future research. Theses limitation can be addressed as: (i) the primary database was provided based on fieldwork, historical landslide records and remote-sensing information. The limited number of reference landslides in the recorded data (as is commonly the case) made modelling challenging; (ii) the data on the triggering factors were highly dependent on the spatial resolutions of satellite sensor imagery and DEM data quality, which affected directly the quality of the input database; (iii) the predictive model required strong processors to manage the inputs during landslide susceptibility assessments. Thus, for future scientific research involving, for example, even finer spatial resolution images the adequacy of the available processors needs to be considered for landslide susceptibility analysis. Referring to Fig. 9, which presents the landslide susceptibility assessment results in the study area, it is clear that the main risk area lies in the west and southwest part of Isfahan. Geo-structural studies suggest that the high-susceptible areas are located in the Zagros folded zone and follow the main Zagros trend in the province. Thus, it can be stated that the geological-based triggering factors play important roles in determining landslide occurrence in Isfahan. Fieldwork suggested the effect of geo-structures as triggers of landslide movements. It is interesting then that the CNN-DNN model was able to provide detailed mapping to corroborate this.
The   8,32,42,44 . This indicates that the CNN can be used as a basic predictive model. However, the coupled CNN-DNN model in this paper was able to increase the accuracy further, to reach 0.909. The CNN-DNN method uses a first-stage CNN component that attempts to extract meaningful semantic information from low-level input covariates that may be related to the target for prediction, in this case, landslide susceptibility classes. The results suggest that the first-stage CNN is efficient in extracting suitable environmental features related to landslide susceptibility. This is important because it is unclear whether landslides should be considered as spatially continuous phenomena or spatial objects [18][19][20][26][27][28][29]64 . On the one hand, landslides are complex geomorphological processes manifested as changes in states in space and time, including variation within the landslide (rupture zone and impacted area). Thus, at a fine spatial scale, one might consider a continuous statistical model appropriate for landslide susceptibility mapping. On the other hand, landslides create discrete rupture zones and impacted areas that appear against a landscape background. In this sense, and at a scale where variation between landslides becomes more important, landslides can be considered discrete objects.
The problem with the above duality between the continua-and object-based views of the world becomes obvious when considering the characterisation of existing landslides and prediction of yet-to-occur landslides. Landslides do not occur at a pixel, but rather occupy some positive area. As such, conventional methods, which are commonly pixel-based, insufficiently characterise the landslide as a spatially extensive phenomenon. They also run into difficulties in predicting yet-to-occur landslides because predictions of susceptibility are constrained to a pixel. The CNN-DNN model deals directly with these two problems by analysing spatial patches of data rather than pixels.
The second problem we leave as an open question for future research. Specifically, the CNN-DNN can transform the spatial information in the input covariates into meaningful higher-order feature representations about landslide susceptibility. This makes sense concerning landslides when one considers the conditions that may lead to failure. These conditions are often spatial, requiring not the conditions at a point to be satisfied, but the conjunction of several conditions over an area to be satisfied. For example, it may not be enough for the slope at a single point to be high. Landsliding may be more likely if that same high slope falls in the context of surrounding land which, for example, concentrates water to that point (e.g., via overland flow or throughflow). This requirement for context is true of many of the in situ factors that underpin the susceptibility of a location to fail. Thus, the CNN-DNN approach proposed in this research is an excellently matched algorithm to the specific characteristics of the landsliding phenomenon and problem under study.

Conclusions
Landslide susceptibility mapping is one of the most challenging tasks in geo-hazard assessment. In this context, application of modern deep learning techniques can be advantageous for analysis. Here, we applied a novel CNN-DNN predictive model for assessment of landslide susceptibility in Isfahan province, Iran. The model was fitted between historical landslides data (which accounted for different types of landsliding) and various triggering factors. The proposed CNN-DNN model produced a very high accuracy, outperforming a wide range of benchmark approaches, specifically the SVM, LR, GNB, MLP, BNB and DT methods. More specifically, the CNN-DNN (AUC = 90.9%; IRs = 84.8%) achieved greater prediction accuracy than the corresponding single classifiers such as SVM (AUC = 81.5%; IRs = 80. We, thus, recommend the CNN-DNN approach for landslide susceptibility mapping. Importantly, the CNN component of the approach has great advantages for landslide susceptibility mapping precisely because it matches well, and takes advantage of, the spatially extensive nature of the landslide phenomenon itself.
The CNN-DNN model predicted a high-susceptibility zone in the west and south-western parts of the study area, appearing as a stripe aligned with the northwest-southeast main Zagros trend in the region.